Session 4: Statistical Language Modeling

Author

  • Aravind K. Joshi
Abstract

Corpus-based Natural Language Processing (NLP) is now a well-established paradigm in NLP. The availability of large corpora, often annotated in various ways, has led to the development of a variety of approaches to statistical language modeling. The papers in this session represent many of these important approaches. I will try to classify these papers along different dimensions, giving the reader an overview as well as some understanding of the future directions of work in this area.

Similar resources

Dynamic Web log session identification with statistical language models

on statistical language modeling. Unlike standard timeout methods, which use fixed time thresholds for session identification, we use an information theoretic approach that yields more robust results for identifying session boundaries. We evaluate our new approach by learning interesting association rules from the segmented session files. We then compare the performance of our approach to three...

Session 2: Language Modeling

This session presented four interesting papers on statistical language modeling aimed at improving large-vocabulary speech recognition. The basic problem in language modeling is to derive accurate underlying representations from a large amount of training data, which is the same fundamental problem faced in acoustic modeling. As demonstrated in this session, many techniques used for acoustic mode...

User Modeling of Parallel Workloads

The goal of workload modeling is to simulate the expected workload, accurately enough to enable making correct design and administrative decisions. Several statistical features of production parallel computer workloads, which are not embodied in current models, have been identified. Their practical importance is demonstrated by two new kinds of schedulers – a key component in determining the ov...

Session 11 - Natural Language III

The five papers in this session, as well as the ten papers in the other two natural language sessions, can be classified into three broad categories: (1) statistical approaches to natural language processing and the automatic acquisition of linguistic structure (2 out of 5 papers in this session; 8 out of 15 overall); (2) robust processing of texts by combining multiple partial analyses (2 out ...

Session 8: Statistical Language Modeling

Over the past several years, the successful application of statistical techniques in natural language processing has penetrated further and further into written language technology, proceeding over time from the periphery of written language processing into deeper and deeper aspects of language processing. At the periphery of natural language understanding, Hidden Markov Models were first applie...

Publication date: 1992